Find this repository: https://github.com/libjohn/workshop_textmining
Much of this review comes from the site: https://juliasilge.github.io/tidytext/
The primary package, tidytext, enables all kinds of text mining. See also this helpful free online book: Text Mining with R: A Tidy Approach by Silge and Robinson.
Data
We’ll look at some books by Jane Austen, an English novelist whose novels were published in the 1810s. Austen explored women and marriage within the British upper class. The novelist has a unique and well-earned following within literature. Her work is consistently discussed and honored. To this day, Austen’s novels are the source of many adaptations, written and on-screen. Through the janeaustenr package we can access and mine the text of six Austen novels. We can call the collection of novels a corpus (plural: corpora). An individual novel can also be treated as a corpus in its own right.
Austen is best known for six published works:
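The six titles can be listed straight from the package. A minimal sketch, assuming the janeaustenr and dplyr packages are installed; the name `austen_titles` is ours, chosen for illustration:

```r
library(dplyr)
library(janeaustenr)

# austen_books() returns one row per line of text, with columns `text` and `book`
austen_titles <- austen_books() %>%
  distinct(book)          # keep one row per novel

austen_titles
```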
Data Cleaning
Text mining typically requires a lot of data cleaning. In this case, we start with the janeaustenr collection, which has already been cleaned. Nonetheless, further data wrangling is required. The first step is to identify a line number for each line of text in each book.
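One way to add the line numbers is to group by book and number the rows within each group. This is a sketch under that assumption; the object name `original_books` is our choice:

```r
library(dplyr)
library(janeaustenr)

original_books <- austen_books() %>%
  group_by(book) %>%
  mutate(line = row_number()) %>%   # number the lines within each book, starting at 1
  ungroup()

original_books
```

Because the numbering restarts for every book, line 1 appears once per novel.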
Tokens
To work with these data as a tidy dataset, we need to restructure the data through tokenization. In our case a token is a single word. We want one-token-per-row. The unnest_tokens() function (tidytext package) will convert a data frame with a text column into the one-token-per-row format.
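Token: a meaningful unit of text, such as a word, that we use for analysis.
Tokenization: the process of splitting text into tokens.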
The default tokenizing mode is “words”. With the unnest_tokens() function, tokens can be: words, characters, character_shingles, ngrams, skip_ngrams, sentences, lines, paragraphs, regex, tweets, and ptb (Penn Treebank).
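To see how the tokenizing mode changes the result, here is a small sketch on a single made-up row of text (the tibble `example_text` is hypothetical, not part of the workshop data). It compares the default word tokens with bigrams:

```r
library(dplyr)
library(tibble)
library(tidytext)

# A one-row example data frame (hypothetical, for illustration only)
example_text <- tibble(text = "It is a truth universally acknowledged")

# Default mode: one lower-cased word per row
example_words <- example_text %>%
  unnest_tokens(word, text)

# Bigrams: overlapping pairs of adjacent words
example_bigrams <- example_text %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2)

example_words
example_bigrams
```

Six words yield five overlapping bigrams, starting with "it is".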
Process
- Group by line number (above)
- Make each single word a token
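The two steps above can be sketched as follows, assuming the same packages as before. The names `original_books` and `tidy_books` are the ones used later in this notebook:

```r
library(dplyr)
library(tidytext)
library(janeaustenr)

# Step 1: line numbers per book
original_books <- austen_books() %>%
  group_by(book) %>%
  mutate(line = row_number()) %>%
  ungroup()

# Step 2: unnest_tokens(output column, input column); "words" is the default mode
tidy_books <- original_books %>%
  unnest_tokens(word, text)

tidy_books
```

The `text` column is replaced by `word`, and each token keeps the `book` and `line` it came from.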
Now that the data is in the one-word-per-row format, we can manipulate it with tidy tools like dplyr.
Stop Words
tidytext::get_stopwords()
Remove stop-words from the books.
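A minimal sketch of the removal step, assuming the stopwords package (used by `get_stopwords()`) is installed. An anti-join keeps only the tokens that do not appear in the stop-word list; the name `matchwords_books` is our choice:

```r
library(dplyr)
library(tidytext)
library(janeaustenr)

tidy_books <- austen_books() %>%
  unnest_tokens(word, text)

# get_stopwords() defaults to the English "snowball" list
matchwords_books <- tidy_books %>%
  anti_join(get_stopwords(), by = "word")   # drop every token found in the list

matchwords_books
```

Naming the join column with `by = "word"` also silences the "Joining, by" message that dplyr otherwise prints.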
Join types
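The joins that matter most for text mining can be compared on toy data. The two small tibbles below are hypothetical and exist only to show what each join keeps:

```r
library(dplyr)
library(tibble)

# Hypothetical toy data: a few tokens and a tiny sentiment lexicon
tokens  <- tibble(word = c("the", "happy", "sad", "carriage"))
lexicon <- tibble(word = c("happy", "sad"),
                  sentiment = c("positive", "negative"))

kept_with_cols <- inner_join(tokens, lexicon, by = "word")  # matching rows, plus lexicon columns
kept_only      <- semi_join(tokens, lexicon, by = "word")   # matching rows, token columns only
dropped        <- anti_join(tokens, lexicon, by = "word")   # rows with no match in the lexicon

kept_with_cols
kept_only
dropped
```

In this notebook, `anti_join()` is the tool for removing stop-words, while `inner_join()` attaches sentiment labels to the words that have them.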

Customize your dictionaries
You can customize stop-words data frames, sentiment data frames, etc.
There are various stop-word dictionaries. Here we add the stop-word “farfegnugen” to a custom dictionary. If Jane Austen ever used the word “farfegnugen”, that would be weird, or bad. So we will take pains not to calculate the sentiment of that word, whether or not the term shows up in a sentiment dictionary. That is, we will remove the word by making it part of a customized stop-words dictionary.
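A custom entry can be appended to an existing list by giving it the same columns that `get_stopwords()` returns (`word` and `lexicon`). A sketch, assuming the stopwords package is installed; the names `stopwords_custom` and `stopwords_all` are ours:

```r
library(dplyr)
library(tibble)
library(tidytext)

# One-row data frame with the same columns as get_stopwords()
stopwords_custom <- tribble(
  ~word,         ~lexicon,
  "farfegnugen", "custom"
)

# Append the custom entry to the default ("snowball") English list
stopwords_all <- bind_rows(get_stopwords(), stopwords_custom)

tail(stopwords_all)
```

Passing `stopwords_all` to `anti_join()` in place of `get_stopwords()` then removes the custom word along with the standard ones.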
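Available stop-word sources, as listed by stopwords::stopwords_getsources():
[1] "snowball" "stopwords-iso" "misc" "smart" "marimo" "ancient" "nltk"
[8] "perseus"
Languages covered by the snowball source, as listed by stopwords::stopwords_getlanguages("snowball"):
[1] "da" "de" "en" "es" "fi" "fr" "hu" "ir" "it" "nl" "no" "pt" "ro" "ru" "sv"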
Calculate word frequency
How many distinct words remain in the Austen novels once we remove the snowball stop-words? There are 14375 distinct countable words.
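Both the number of distinct words and the frequency of each word come from a single `count()`. A sketch, assuming the packages above; the exact total may differ slightly across package versions:

```r
library(dplyr)
library(tidytext)
library(janeaustenr)

matchwords_books <- austen_books() %>%
  unnest_tokens(word, text) %>%
  anti_join(get_stopwords(), by = "word")

# One row per distinct word, most frequent first
word_counts <- matchwords_books %>%
  count(word, sort = TRUE)

nrow(word_counts)     # number of distinct countable words
head(word_counts)
```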
Word clouds
Basic word cloud
A non-interactive word cloud.
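A sketch of a basic cloud, assuming the wordcloud package is available (the call is skipped if it is not). `wordcloud::wordcloud()` takes a vector of words and a vector of frequencies:

```r
library(dplyr)
library(tidytext)
library(janeaustenr)

top_words <- austen_books() %>%
  unnest_tokens(word, text) %>%
  anti_join(get_stopwords(), by = "word") %>%
  count(word, sort = TRUE)

# Draw at most 100 words; font size scales with frequency
if (requireNamespace("wordcloud", quietly = TRUE)) {
  with(top_words, wordcloud::wordcloud(word, n, max.words = 100))
}
```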

Your Turn: Exercise 1
Goal: Make a basic word cloud for the novel Pride and Prejudice, using the data frame pride_prej_novel
- Prepare
- Tokenize pride_prej_novel with unnest_tokens()
- Remove stop-words
- Calculate word frequency
- Make a simple word cloud
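One possible solution is sketched below; try the steps yourself before reading it. It assumes the `prideprejudice` character vector exported by janeaustenr and, for the last step, an installed wordcloud package. The name `pride_prej_counts` is ours:

```r
library(dplyr)
library(tibble)
library(tidytext)
library(janeaustenr)

# Prepare: one row per line of the novel
pride_prej_novel <- tibble(text = prideprejudice) %>%
  mutate(line = row_number())

# Tokenize, remove stop-words, calculate word frequency
pride_prej_counts <- pride_prej_novel %>%
  unnest_tokens(word, text) %>%
  anti_join(get_stopwords(), by = "word") %>%
  count(word, sort = TRUE)

# Make a simple word cloud (skipped if wordcloud is not installed)
if (requireNamespace("wordcloud", quietly = TRUE)) {
  with(pride_prej_counts, wordcloud::wordcloud(word, n, max.words = 100))
}
```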
Sentiment Analysis
get_sentiments()
Let’s see what positive words exist in the bing dictionary. Then, count the frequency of those positive words that exist in Emma.
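A sketch of both steps, assuming the bing lexicon that ships with tidytext. A semi-join keeps the Emma tokens that appear in the positive list without adding any lexicon columns; the names `positive` and `emma_positive` are our choice:

```r
library(dplyr)
library(tidytext)
library(janeaustenr)

# Positive words in the bing lexicon
positive <- get_sentiments("bing") %>%
  filter(sentiment == "positive")

# Frequency of those words in Emma
emma_positive <- austen_books() %>%
  filter(book == "Emma") %>%
  unnest_tokens(word, text) %>%
  semi_join(positive, by = "word") %>%
  count(word, sort = TRUE)

emma_positive
```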
Prepare to visualize sentiment score
Match all the Austen books to the bing sentiment dictionary. Count the word frequency.
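A sketch of the match, assuming the same packages. An inner join keeps only the tokens found in the lexicon, and `count(book)` then tallies them per novel; the name `bing_matches_by_book` is ours:

```r
library(dplyr)
library(tidytext)
library(janeaustenr)

tidy_books <- austen_books() %>%
  unnest_tokens(word, text)

# Recent dplyr versions may warn about a many-to-many match here,
# because a few words are listed in bing under both sentiments.
bing_matches_by_book <- tidy_books %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(book)

bing_matches_by_book
```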
Calculate sentiment
Algorithm: sentiment = positive - negative
Define a section of text.
“Small sections of text may not have enough words in them to get a good estimate of sentiment while really large sections can wash out narrative structure. For these books, using 80 lines works well, but this can vary depending on individual texts…” (Text Mining with R)
bing <- get_sentiments("bing")

janeaustensentiment <- tidy_books %>%
  inner_join(bing, by = "word") %>%
  count(book, index = line %/% 80, sentiment) %>%  # `%/%` is integer division: 80 lines per section
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%  # formerly spread(sentiment, n, fill = 0)
  mutate(sentiment = positive - negative)  # the algorithm: net sentiment per section

janeaustensentiment
Viz it
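A sketch of one way to plot the section scores, assuming ggplot2 and tidyr are installed. It rebuilds the sentiment table so the block runs on its own, then draws one panel per novel. The styling choices are ours and may differ from the original figure:

```r
library(dplyr)
library(tidyr)
library(ggplot2)
library(tidytext)
library(janeaustenr)

janeaustensentiment <- austen_books() %>%
  group_by(book) %>%
  mutate(line = row_number()) %>%
  ungroup() %>%
  unnest_tokens(word, text) %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(book, index = line %/% 80, sentiment) %>%
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(sentiment = positive - negative)

# One bar per 80-line section, one panel per novel
sentiment_plot <- ggplot(janeaustensentiment, aes(index, sentiment, fill = book)) +
  geom_col(show.legend = FALSE) +
  geom_hline(yintercept = 0) +
  facet_wrap(~ book, ncol = 2, scales = "free_x")

sentiment_plot
```

Bars above zero mark sections with more positive than negative words; `scales = "free_x"` lets each novel use its own length on the x-axis.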

Preparation: Most common positive and negative words
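Counting by both word and sentiment shows which words contribute most to each side of the score. A sketch, assuming the same packages; the name `bing_word_counts` is our choice:

```r
library(dplyr)
library(tidytext)
library(janeaustenr)

bing_word_counts <- austen_books() %>%
  unnest_tokens(word, text) %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(word, sentiment, sort = TRUE)   # most frequent sentiment words first

bing_word_counts
```

Text Mining with R points out that “miss” is coded as negative in bing but is mostly a title in Austen’s novels, which makes it a good candidate for a custom stop-word.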
Viz it too

Dictionaries
What other dictionaries are available? How to choose?
AFINN
What words in Emma match the AFINN dictionary?
Make Sections
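Just as we calculated sentiment above, divide the text into sections and calculate a score for each. Here a section is 80 AFINN-matched words rather than 80 lines, and the score for a section is the sum of the AFINN values of its words.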
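The sectioning logic can be sketched on toy data. The tibble `toy_afinn` below is a hypothetical stand-in for the matched Emma words, with illustrative scores rather than values looked up in the lexicon, and the section size is 3 instead of 80 so the result is easy to check by hand:

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for the AFINN-matched words (scores are illustrative)
toy_afinn <- tibble(
  word  = c("happy", "sad", "love", "hate", "good", "bad"),
  value = c(3, -2, 3, -3, 3, -3)
)

section_size <- 3   # the notebook uses 80

toy_sections <- toy_afinn %>%
  mutate(word_count = row_number(),              # running position of each matched word
         index = word_count %/% section_size) %>%  # integer division assigns the section
  group_by(index) %>%
  summarise(sentiment = sum(value), .groups = "drop")  # sum the scores in each section

toy_sections
```

Because the running count starts at 1, the first section holds one word fewer than the rest; using `(word_count - 1) %/% section_size` would make all full sections the same size. With the real data, the input would come from `inner_join(get_sentiments("afinn"))`, which needs the textdata package and a one-time download of the lexicon.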
Viz it

